1 Setup

This section loads the required R packages for data retrieval, manipulation, panel econometrics, and visualization.

library(eurostat)
library(tidyverse)
library(plm)
library(lmtest)
library(sandwich)
library(stargazer)
library(corrplot)
library(kableExtra)
library(scales)
library(moments)

2 Data Sources

We use annual panel data from Eurostat for the EU-27 countries covering the period 2013–2023. The dataset includes GDP per capita, net migration, population (used to scale migration), unemployment, and investment.

Note on sample period: Raw data are available for 2013–2023, but the final analysis sample covers 2014–2023. GDP per capita growth is computed as the log difference relative to the previous year, which requires lagged observations. As a result, growth rates for 2013 cannot be computed and these observations are excluded from the regression sample.
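A minimal sketch of the growth calculation on made-up numbers (the series below is illustrative, not Eurostat data). It shows why the first year of each series drops out: the log difference needs the previous year's level.

```r
# Illustrative series only; the values are invented for this sketch
toy <- data.frame(
  time   = 2013:2016,
  gdp_pc = c(100, 102, 101, 105)
)

# Log-difference growth in percent: 100 * (ln y_t - ln y_{t-1})
# The first year has no predecessor, so its growth rate is NA
toy$growth <- c(NA, 100 * diff(log(toy$gdp_pc)))

# Drop the year that lost its lag, mirroring the exclusion of 2013
toy_sample <- toy[!is.na(toy$growth), ]
toy_sample
```

For small changes the log difference is close to the ordinary percentage change: a rise from 100 to 102 gives 100 × ln(1.02) ≈ 1.98 rather than 2.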

| Variable | Eurostat Code | Description |
|---|---|---|
| GDP per capita | sdg_08_10 | Real GDP per capita (chain-linked volumes, euro per capita) |
| Net Migration | migr_netmigr | Net migration (immigration - emigration) |
| Population | demo_gind | Total population |
| Unemployment | une_rt_a | Unemployment rate (% of active population) |
| Investment Rate | sdg_08_11 | Gross fixed capital formation (% of GDP) |

3 Theoretical Background and Hypotheses

This study examines the determinants of GDP per capita growth in the European Union, focusing on three key macroeconomic variables: unemployment, investment, and migration. Each variable is grounded in established economic theory, which informs the directional hypotheses tested in the empirical analysis.

3.1 Theoretical Framework

Unemployment and Economic Growth. The negative relationship between unemployment and output growth is well documented in the macroeconomic literature, most notably through Okun’s Law [@okun1962]. Okun observed that for every percentage point by which unemployment exceeds its natural rate, real GDP falls roughly 2–3 percent below its potential level. This relationship reflects the direct loss of productive capacity when labor resources remain idle, as well as indirect effects through reduced consumer spending and lower aggregate demand.

Investment and Economic Growth. The Solow-Swan neoclassical growth model [@solow1956] establishes investment as a fundamental driver of long-run economic growth. Higher rates of gross fixed capital formation expand the capital stock, increase labor productivity, and shift the economy toward a higher steady-state level of output per capita. In the short to medium run, investment also stimulates aggregate demand, reinforcing its positive effect on growth.

Migration and Economic Growth. The relationship between migration and economic growth is theoretically ambiguous. On the one hand, net immigration expands the labor force, increases productive capacity, and can alleviate demographic pressures associated with aging populations—effects emphasized in the endogenous growth literature [@romer1990]. On the other hand, large immigration inflows may depress wages in the short run, strain public services, or dilute capital per worker if investment does not keep pace. Empirical evidence across EU countries suggests that the labor supply and human capital effects tend to dominate, yielding a modest positive relationship between migration and growth, particularly in economies with flexible labor markets [@docquier2014].

3.2 Hypotheses

Based on the theoretical considerations above, we formulate the following directional hypotheses:

H1 (Unemployment): There is a negative relationship between unemployment rate and GDP per capita growth. Higher unemployment represents underutilized labor resources and reduced aggregate demand, both of which suppress output growth (Okun’s Law).

H2 (Investment): There is a positive relationship between gross fixed capital formation (as a share of GDP) and GDP per capita growth. Capital accumulation increases productive capacity and drives both short-run demand effects and long-run supply-side growth (Solow growth model).

H3 (Migration): There is a positive relationship between net migration rate and GDP per capita growth. Despite theoretical ambiguity regarding the short-run effects of immigration, we hypothesize that labor force expansion and the demographic contributions of migrants outweigh any capital-dilution effects, yielding a net positive association with growth.

These hypotheses are tested using panel data methods that account for unobserved country heterogeneity and common time shocks.

3.2.1 Summary of Testable Hypotheses

For clarity, the hypotheses tested in this analysis are:

  • H1: Higher unemployment is negatively associated with GDP growth (based on Okun’s Law).
  • H2: Higher investment rates are positively associated with GDP growth (based on Solow Growth Model).
  • H3: Net migration has a positive association with economic growth (through labor supply expansion).

4 Data Preparation

This section retrieves raw data from Eurostat and processes each variable (GDP, migration, unemployment, investment) for subsequent analysis.

4.1 GDP per Capita

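# NOTE: distinct() keeps the first row per geo-year. If the extract holds
# several series per key (e.g., different units), filter to the intended
# one before this step.
gdp_pc <- get_eurostat(
  id = "sdg_08_10",
  time_format = "num"
) %>%
  select(geo, TIME_PERIOD, values) %>%
  rename(
    time = TIME_PERIOD,
    gdp_pc = values
  ) %>%
  filter(time >= 2013, time <= 2023)

gdp_pc <- gdp_pc %>%
  distinct(geo, time, .keep_all = TRUE)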

4.2 Migration Data

migration_raw <- get_eurostat(
  id = "migr_netmigr",
  time_format = "num"
)

migration <- migration_raw %>%
  select(geo, TIME_PERIOD, values) %>%
  rename(
    time = TIME_PERIOD,
    net_migration = values
  ) %>%
  filter(time >= 2013, time <= 2023)
migration <- migration %>%
  distinct(geo, time, .keep_all = TRUE)

# Population is needed to express net migration per 1,000 inhabitants
population_raw <- get_eurostat(
  id = "demo_gind",
  time_format = "num"
)

population <- population_raw %>%
  select(geo, TIME_PERIOD, values) %>%
  rename(
    time = TIME_PERIOD,
    population = values
  ) %>%
  filter(time >= 2013, time <= 2023) %>%
  distinct(geo, time, .keep_all = TRUE)

# Net migration rate per 1,000 inhabitants
migration <- migration %>%
  left_join(population, by = c("geo", "time")) %>%
  mutate(
    migration_rate = 1000 * net_migration / population
  )

4.3 Unemployment Data

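unemployment_raw <- get_eurostat(
  id = "une_rt_a",
  time_format = "num"
)

# Total population aged 15-74, both sexes.
# NOTE: if the extract reports more than one unit per key, filter on the
# unit column as well before distinct() picks the first row.
unemployment <- unemployment_raw %>%
  filter(sex == "T", age == "Y15-74") %>%
  select(geo, TIME_PERIOD, values) %>%
  rename(
    time = TIME_PERIOD,
    unemployment_rate = values
  ) %>%
  filter(time >= 2013, time <= 2023)

unemployment <- unemployment %>%
  distinct(geo, time, .keep_all = TRUE)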

4.4 Investment Data (Gross Fixed Capital Formation)

investment_raw <- get_eurostat(
  id = "sdg_08_11",
  time_format = "num"
)

investment <- investment_raw %>%
  select(geo, TIME_PERIOD, values) %>%
  rename(
    time = TIME_PERIOD,
    investment_rate = values
  ) %>%
  filter(time >= 2013, time <= 2023) %>%
  distinct(geo, time, .keep_all = TRUE)
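
Each retrieval in this section ends with distinct(geo, time, .keep_all = TRUE), which keeps the first row for every country-year and silently discards the rest. That is harmless when duplicates are exact copies, but if a dataset carries several series per country-year (for example different units or breakdowns), the retained series depends on row order. A minimal sketch of a check, using an invented extract; the column `sector` and its codes are hypothetical and not taken from the Eurostat datasets used here:

```r
# Toy stand-in for a raw extract; `sector` and its codes are hypothetical
# and only illustrate a dataset with several series per country-year
raw <- data.frame(
  geo    = c("AT", "AT", "AT", "AT"),
  time   = c(2013, 2013, 2014, 2014),
  sector = c("TOTAL", "BUSINESS", "TOTAL", "BUSINESS"),
  values = c(23.0, 14.0, 22.5, 13.8)
)

# Count rows per (geo, time) key; any count above 1 means that
# distinct(geo, time, .keep_all = TRUE) would keep only the first row
key_counts <- aggregate(values ~ geo + time, data = raw, FUN = length)
names(key_counts)[3] <- "n_rows"
dup_keys <- key_counts[key_counts$n_rows > 1, ]

# Select the intended series explicitly instead of relying on row order
clean <- raw[raw$sector == "TOTAL", c("geo", "time", "values")]
```

If `dup_keys` is empty for a real extract, the distinct() step changes nothing; if it is not, the dimension columns of the raw data show which series need to be filtered.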

5 Panel Construction

This section merges all variables into a unified panel dataset for the EU-27 countries and computes the dependent variable (GDP per capita growth).

5.1 EU-27 Countries

eu27 <- c(
  "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE",
  "EL", "HU", "IE", "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT",
  "RO", "SK", "SI", "ES", "SE"
)

# Country names for better visualization
country_names <- c(
  "AT" = "Austria", "BE" = "Belgium", "BG" = "Bulgaria", "HR" = "Croatia",
  "CY" = "Cyprus", "CZ" = "Czechia", "DK" = "Denmark", "EE" = "Estonia",
  "FI" = "Finland", "FR" = "France", "DE" = "Germany", "EL" = "Greece",
  "HU" = "Hungary", "IE" = "Ireland", "IT" = "Italy", "LV" = "Latvia",
  "LT" = "Lithuania", "LU" = "Luxembourg", "MT" = "Malta", "NL" = "Netherlands",
  "PL" = "Poland", "PT" = "Portugal", "RO" = "Romania", "SK" = "Slovakia",
  "SI" = "Slovenia", "ES" = "Spain", "SE" = "Sweden"
)

panel_data <- gdp_pc %>%
  distinct(geo, time, .keep_all = TRUE) %>%
  filter(geo %in% eu27) %>%
  inner_join(
    migration %>% distinct(geo, time, .keep_all = TRUE) %>% filter(geo %in% eu27),
    by = c("geo", "time")
  ) %>%
  inner_join(
    unemployment %>% distinct(geo, time, .keep_all = TRUE) %>% filter(geo %in% eu27),
    by = c("geo", "time")
  ) %>%
  inner_join(
    investment %>% distinct(geo, time, .keep_all = TRUE) %>% filter(geo %in% eu27),
    by = c("geo", "time")
  ) %>%
  arrange(geo, time) %>%
  group_by(geo) %>%
  mutate(
    gdp_pc_growth = 100 * (log(gdp_pc) - log(dplyr::lag(gdp_pc))),
    country_name = country_names[geo]
  ) %>%
  ungroup() %>%
  filter(!is.na(gdp_pc_growth))
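
One caveat about the growth calculation: dplyr::lag() shifts by row position, so after sorting it pairs each row with the previous available row of the same country. If a year were missing in the middle of a series, the resulting "annual" growth rate would span two years. In this dataset the by-year table in Section 7.5 suggests the only missing observation falls in 2023, so the issue does not arise, but a guard is cheap. A minimal base-R sketch on invented data for a single hypothetical country "XX":

```r
# Invented series with a gap: 2016 is missing
toy <- data.frame(
  geo    = "XX",
  time   = c(2014, 2015, 2017, 2018),
  gdp_pc = c(100, 103, 108, 110)
)
toy <- toy[order(toy$geo, toy$time), ]

# Positional lag of the level and of the year (single country, so no grouping)
prev_gdp  <- c(NA, head(toy$gdp_pc, -1))
prev_time <- c(NA, head(toy$time, -1))

# Keep the growth rate only when the previous row is exactly one year earlier
is_consecutive <- !is.na(prev_time) & (toy$time - prev_time == 1)
toy$growth <- ifelse(is_consecutive,
                     100 * (log(toy$gdp_pc) - log(prev_gdp)),
                     NA_real_)
toy
```

With several countries the same logic would be applied within each country, for example inside the grouped mutate() used above.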

6 Final Dataset Check

This section verifies the panel dimensions, checks for balance, and confirms the final sample structure before analysis.

panel_data %>%
  summarise(
    countries = n_distinct(geo),
    years = n_distinct(time),
    observations = n()
  )
## # A tibble: 1 × 3
##   countries years observations
##       <int> <int>        <int>
## 1        27    10          269
table(table(panel_data$geo))
## 
##  9 10 
##  1 26
panel_data %>%
  count(geo, name = "n_obs") %>%
  arrange(n_obs)
## # A tibble: 27 × 2
##    geo   n_obs
##    <chr> <int>
##  1 BG        9
##  2 AT       10
##  3 BE       10
##  4 CY       10
##  5 CZ       10
##  6 DE       10
##  7 DK       10
##  8 EE       10
##  9 EL       10
## 10 ES       10
## # ℹ 17 more rows

Note on panel balance: The dataset is nearly balanced, with 26 of 27 countries contributing the full 10 years of observations (2014–2023). Bulgaria has 9 observations because one year is missing in the source data; the by-year table in Section 7.5 shows 26 countries for 2023, which identifies 2023 as the missing year. This minor imbalance does not affect the validity of the panel estimators, as both Fixed Effects and Random Effects accommodate unbalanced panels. No observations are dropped or imputed to address this issue.
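
To locate a missing country-year directly rather than infer it from counts, the observed panel can be compared with the full grid of combinations. A minimal base-R sketch on an invented panel (countries "A" and "B" are placeholders):

```r
# Toy panel: country "B" lacks its 2016 observation (invented data)
toy <- data.frame(
  geo  = c("A", "A", "A", "B", "B"),
  time = c(2014, 2015, 2016, 2014, 2015)
)

# Full grid of country-year combinations a balanced panel would contain
full_grid <- expand.grid(
  geo  = unique(toy$geo),
  time = unique(toy$time),
  stringsAsFactors = FALSE
)

# Cells present in the grid but absent from the data
observed <- paste(toy$geo, toy$time)
missing_cells <- full_grid[!(paste(full_grid$geo, full_grid$time) %in% observed), ]
missing_cells
```

Applied to panel_data, the same steps would list the single absent country-year.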

7 Descriptive Statistics

This section presents summary statistics, variance decomposition (between vs. within), and correlation analysis to characterize the data before regression modeling.

7.1 Summary Statistics

desc_stats <- panel_data %>%
  select(gdp_pc_growth, migration_rate, unemployment_rate, investment_rate) %>%
  pivot_longer(everything(), names_to = "Variable", values_to = "Value") %>%
  group_by(Variable) %>%
  summarise(
    N = n(),
    Mean = round(mean(Value, na.rm = TRUE), 2),
    SD = round(sd(Value, na.rm = TRUE), 2),
    Min = round(min(Value, na.rm = TRUE), 2),
    Q1 = round(quantile(Value, 0.25, na.rm = TRUE), 2),
    Median = round(median(Value, na.rm = TRUE), 2),
    Q3 = round(quantile(Value, 0.75, na.rm = TRUE), 2),
    Max = round(max(Value, na.rm = TRUE), 2),
    Skewness = round(moments::skewness(Value, na.rm = TRUE), 2),
    Kurtosis = round(moments::kurtosis(Value, na.rm = TRUE), 2),
    .groups = "drop"
  )

kable(desc_stats, caption = "Summary Statistics of Key Variables") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
  footnote(general = "Skewness: 0 = symmetric; Kurtosis: 3 = normal distribution")
Summary Statistics of Key Variables

| Variable | N | Mean | SD | Min | Q1 | Median | Q3 | Max | Skewness | Kurtosis |
|---|---|---|---|---|---|---|---|---|---|---|
| gdp_pc_growth | 269 | 2.27 | 3.75 | -12.15 | 0.64 | 2.25 | 4.43 | 21.07 | -0.05 | 6.54 |
| investment_rate | 269 | 13.10 | 4.10 | 5.31 | 11.11 | 12.98 | 14.78 | 49.42 | 3.76 | 30.71 |
| migration_rate | 269 | 2.19 | 3.48 | -4.49 | 0.18 | 1.36 | 3.01 | 17.70 | 1.66 | 6.77 |
| unemployment_rate | 269 | 7.60 | 4.09 | 2.00 | 5.10 | 6.60 | 8.70 | 26.60 | 1.99 | 7.96 |

Note: Skewness: 0 = symmetric; Kurtosis: 3 = normal distribution
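
The footnote's benchmark of 3 implies that the reported kurtosis is the raw (not excess) measure. As a sketch of what these statistics compute, assuming the usual moment-based definitions with divide-by-n central moments (the helper name `skew_kurt` is made up for this illustration):

```r
# Moment-based skewness and kurtosis from population (divide-by-n) moments
skew_kurt <- function(x) {
  x  <- x[!is.na(x)]
  d  <- x - mean(x)
  m2 <- mean(d^2)   # second central moment
  m3 <- mean(d^3)   # third central moment
  m4 <- mean(d^4)   # fourth central moment
  c(skewness = m3 / m2^1.5, kurtosis = m4 / m2^2)
}

# Symmetric toy sample: the third central moment, and so the skewness, is zero
skew_kurt(c(-2, -1, 0, 1, 2))
```

Read against the table, kurtosis values well above 3 (6.54 for GDP growth, 30.71 for the investment rate) point to heavy tails, that is, a few extreme country-years such as the investment-rate maximum of 49.42.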

7.2 Between vs. Within Variation

For panel data analysis, it’s crucial to understand the decomposition of variance:

  • Between variation: Differences across countries (cross-sectional)
  • Within variation: Changes over time within each country (temporal)
# Calculate overall, between, and within statistics
variance_decomp <- panel_data %>%
  select(geo, gdp_pc_growth, migration_rate, unemployment_rate, investment_rate) %>%
  pivot_longer(-geo, names_to = "Variable", values_to = "Value") %>%
  group_by(Variable) %>%
  summarise(
    `Overall Mean` = round(mean(Value, na.rm = TRUE), 2),
    `Overall SD` = round(sd(Value, na.rm = TRUE), 2),
    .groups = "drop"
  )

# Between variation (variation of country means)
between_var <- panel_data %>%
  group_by(geo) %>%
  summarise(
    gdp_pc_growth = mean(gdp_pc_growth, na.rm = TRUE),
    migration_rate = mean(migration_rate, na.rm = TRUE),
    unemployment_rate = mean(unemployment_rate, na.rm = TRUE),
    investment_rate = mean(investment_rate, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  pivot_longer(-geo, names_to = "Variable", values_to = "Value") %>%
  group_by(Variable) %>%
  summarise(
    `Between SD` = round(sd(Value, na.rm = TRUE), 2),
    .groups = "drop"
  )

# Within variation (variation around country means)
within_var <- panel_data %>%
  group_by(geo) %>%
  mutate(
    gdp_pc_growth_dm = gdp_pc_growth - mean(gdp_pc_growth, na.rm = TRUE),
    migration_rate_dm = migration_rate - mean(migration_rate, na.rm = TRUE),
    unemployment_rate_dm = unemployment_rate - mean(unemployment_rate, na.rm = TRUE),
    investment_rate_dm = investment_rate - mean(investment_rate, na.rm = TRUE)
  ) %>%
  ungroup() %>%
  summarise(
    gdp_pc_growth = sd(gdp_pc_growth_dm, na.rm = TRUE),
    migration_rate = sd(migration_rate_dm, na.rm = TRUE),
    unemployment_rate = sd(unemployment_rate_dm, na.rm = TRUE),
    investment_rate = sd(investment_rate_dm, na.rm = TRUE)
  ) %>%
  pivot_longer(everything(), names_to = "Variable", values_to = "Within SD") %>%
  mutate(`Within SD` = round(`Within SD`, 2))

# Combine all
var_decomp_table <- variance_decomp %>%
  left_join(between_var, by = "Variable") %>%
  left_join(within_var, by = "Variable")

kable(var_decomp_table, caption = "Variance Decomposition: Overall, Between, and Within") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
  footnote(
    general = "Between SD: cross-country variation; Within SD: within-country (temporal) variation",
    general_title = "Note: "
  )
Variance Decomposition: Overall, Between, and Within

| Variable | Overall Mean | Overall SD | Between SD | Within SD |
|---|---|---|---|---|
| gdp_pc_growth | 2.27 | 3.75 | 1.51 | 3.44 |
| investment_rate | 13.10 | 4.10 | 3.50 | 2.23 |
| migration_rate | 2.19 | 3.48 | 2.53 | 2.43 |
| unemployment_rate | 7.60 | 4.09 | 3.54 | 2.15 |

Note: Between SD: cross-country variation; Within SD: within-country (temporal) variation

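Interpretation:

  • GDP growth varies mostly within countries over time (Within SD 3.44 vs. Between SD 1.51), so Fixed Effects, which use only within-country variation, retain most of the variation in the dependent variable.
  • Unemployment and investment vary more across countries than over time (Between SD 3.54 and 3.50 vs. Within SD 2.15 and 2.23), so Fixed Effects discard a sizeable share of their variation and may estimate their coefficients less precisely. The migration rate is split roughly evenly (2.53 vs. 2.43).
  • These magnitudes indicate how much information each estimator can use. The choice between Fixed Effects and Random Effects itself rests on whether the country effects are correlated with the regressors, not on which SD is larger.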
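
The decomposition behind the table can be written as an exact identity in sums of squares: total variation equals between-country variation (weighted by the number of observations per country) plus within-country variation. A minimal base-R sketch on an invented unbalanced panel:

```r
# Invented unbalanced toy panel
toy <- data.frame(
  geo = c("A", "A", "A", "B", "B", "C", "C", "C"),
  x   = c(1, 2, 3, 10, 12, 5, 5, 8)
)

grand_mean   <- mean(toy$x)
country_mean <- ave(toy$x, toy$geo)   # each row receives its country mean

ss_total   <- sum((toy$x - grand_mean)^2)
ss_between <- sum((country_mean - grand_mean)^2)  # implicitly weighted by group size
ss_within  <- sum((toy$x - country_mean)^2)

# The two components add up to the total sum of squares
c(total = ss_total, between_plus_within = ss_between + ss_within)
```

The standard deviations reported in the table use slightly different denominators, and the Between SD is an unweighted SD of country means, so their squares add up to the squared Overall SD only approximately.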

7.3 Distribution of Variables

panel_data %>%
  select(gdp_pc_growth, migration_rate, unemployment_rate, investment_rate) %>%
  pivot_longer(everything(), names_to = "Variable", values_to = "Value") %>%
  ggplot(aes(x = Variable, y = Value, fill = Variable)) +
  geom_boxplot(alpha = 0.7, outlier.color = "red", outlier.shape = 1) +
  geom_jitter(width = 0.2, alpha = 0.1, size = 0.5) +
  facet_wrap(~ Variable, scales = "free", ncol = 2) +
  labs(
    title = "Distribution of Key Variables",
    subtitle = "Boxplots with individual observations",
    x = "",
    y = "Value"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    legend.position = "none",
    strip.text = element_text(face = "bold"),
    plot.title = element_text(face = "bold"),
    axis.text.x = element_blank()
  ) +
  scale_fill_brewer(palette = "Set2")

7.4 Statistics by Country

country_stats <- panel_data %>%
  group_by(Country = country_name) %>%
  summarise(
    `Years` = n(),
    `Avg GDP Growth` = round(mean(gdp_pc_growth, na.rm = TRUE), 2),
    `SD GDP Growth` = round(sd(gdp_pc_growth, na.rm = TRUE), 2),
    `Avg Migration Rate` = round(mean(migration_rate, na.rm = TRUE), 2),
    `Avg Unemployment` = round(mean(unemployment_rate, na.rm = TRUE), 2),
    `Avg Investment` = round(mean(investment_rate, na.rm = TRUE), 2),
    .groups = "drop"
  ) %>%
  arrange(desc(`Avg GDP Growth`))

kable(country_stats, caption = "Summary Statistics by Country (2014-2023)") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
  scroll_box(height = "400px")
Summary Statistics by Country (2014-2023)

| Country | Years | Avg GDP Growth | SD GDP Growth | Avg Migration Rate | Avg Unemployment | Avg Investment |
|---|---|---|---|---|---|---|
| Ireland | 10 | 6.71 | 7.05 | 4.17 | 6.86 | 26.22 |
| Romania | 10 | 3.98 | 3.07 | -0.83 | 6.34 | 13.02 |
| Malta | 10 | 3.90 | 5.15 | 9.82 | 4.36 | 13.02 |
| Bulgaria | 9 | 3.84 | 2.77 | 0.35 | 7.24 | 13.05 |
| Poland | 10 | 3.78 | 2.23 | 0.13 | 4.77 | 10.05 |
| Croatia | 10 | 3.74 | 5.18 | -1.44 | 10.03 | 12.90 |
| Lithuania | 10 | 3.55 | 2.41 | 0.55 | 7.58 | 13.04 |
| Cyprus | 10 | 3.38 | 4.03 | 3.52 | 9.77 | 8.24 |
| Hungary | 10 | 3.34 | 3.36 | 0.99 | 4.58 | 14.76 |
| Latvia | 10 | 2.74 | 3.00 | -0.92 | 8.20 | 14.09 |
| Slovenia | 10 | 2.68 | 3.18 | 1.34 | 6.03 | 11.54 |
| Slovakia | 10 | 2.35 | 2.44 | 0.15 | 7.99 | 12.92 |
| Czechia | 10 | 1.88 | 3.04 | 2.89 | 3.25 | 15.50 |
| Portugal | 10 | 1.88 | 4.07 | 2.27 | 8.86 | 12.40 |
| Estonia | 10 | 1.85 | 4.01 | 3.03 | 6.13 | 15.89 |
| Greece | 10 | 1.63 | 4.69 | -0.08 | 19.08 | 6.73 |
| Spain | 10 | 1.52 | 5.11 | 3.18 | 16.84 | 13.52 |
| Netherlands | 10 | 1.36 | 2.76 | 3.01 | 5.47 | 10.68 |
| Denmark | 10 | 1.27 | 2.09 | 1.70 | 5.54 | 13.88 |
| Italy | 10 | 1.21 | 4.44 | 1.30 | 10.30 | 10.57 |
| Belgium | 10 | 1.15 | 2.74 | 2.39 | 6.72 | 15.23 |
| Sweden | 10 | 0.91 | 1.98 | 3.10 | 7.54 | 16.07 |
| Germany | 10 | 0.80 | 2.26 | 3.66 | 3.60 | 11.80 |
| France | 10 | 0.76 | 3.53 | 1.05 | 8.80 | 11.90 |
| Austria | 10 | 0.58 | 3.21 | 3.59 | 5.66 | 15.24 |
| Finland | 10 | 0.54 | 1.86 | 1.77 | 7.94 | 12.68 |
| Luxembourg | 10 | 0.02 | 2.39 | 8.32 | 5.75 | 8.88 |

7.5 Statistics by Year

year_stats <- panel_data %>%
  group_by(Year = time) %>%
  summarise(
    Countries = n(),
    `Avg GDP Growth` = round(mean(gdp_pc_growth, na.rm = TRUE), 2),
    `SD GDP Growth` = round(sd(gdp_pc_growth, na.rm = TRUE), 2),
    `Avg Migration Rate` = round(mean(migration_rate, na.rm = TRUE), 2),
    `Avg Unemployment` = round(mean(unemployment_rate, na.rm = TRUE), 2),
    `Avg Investment` = round(mean(investment_rate, na.rm = TRUE), 2),
    .groups = "drop"
  )

kable(year_stats, caption = "Summary Statistics by Year (EU-27 Average)") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
Summary Statistics by Year (EU-27 Average)

| Year | Countries | Avg GDP Growth | SD GDP Growth | Avg Migration Rate | Avg Unemployment | Avg Investment |
|---|---|---|---|---|---|---|
| 2014 | 27 | 2.09 | 2.15 | 0.94 | 10.84 | 12.10 |
| 2015 | 27 | 3.33 | 3.97 | 1.22 | 9.99 | 12.51 |
| 2016 | 27 | 2.34 | 1.39 | 1.18 | 8.99 | 13.05 |
| 2017 | 27 | 3.73 | 2.43 | 1.48 | 7.87 | 13.17 |
| 2018 | 27 | 3.05 | 1.90 | 1.98 | 6.82 | 12.86 |
| 2019 | 27 | 2.48 | 1.53 | 2.10 | 6.21 | 14.27 |
| 2020 | 27 | -4.38 | 3.55 | 1.57 | 7.02 | 13.53 |
| 2021 | 27 | 7.03 | 2.70 | 1.80 | 6.64 | 13.01 |
| 2022 | 27 | 2.75 | 2.67 | 6.10 | 5.77 | 13.28 |
| 2023 | 26 | 0.17 | 2.41 | 3.60 | 5.82 | 13.27 |

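Notable observations:

  • 2020: The COVID-19 pandemic coincided with a sharp contraction; average growth across the sample fell to -4.38%.
  • 2021: Strong recovery (“bounce-back effect”), with average growth of 7.03%.
  • 2022: Growth remained positive on average (2.75%) despite the energy crisis and inflation, before slowing to 0.17% in 2023 (based on 26 countries).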

7.6 Correlation Matrix

cor_vars <- panel_data %>%
  select(gdp_pc_growth, migration_rate, unemployment_rate, investment_rate) %>%
  na.omit()

cor_matrix <- cor(cor_vars)

corrplot(cor_matrix, 
         method = "color", 
         type = "upper",
         addCoef.col = "black",
         tl.col = "black",
         tl.srt = 45,
         col = colorRampPalette(c("#D73027", "white", "#1A9850"))(100),
         title = "Correlation Matrix",
         mar = c(0, 0, 2, 0))

Interpretation:

  • Negative correlation between unemployment and GDP growth suggests that higher unemployment is associated with lower economic growth.
  • Investment rate typically shows a positive correlation with GDP growth, reflecting the importance of capital formation.
  • The relationship between migration and GDP growth can be examined further in the regression analysis.
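
These are pooled correlations, which mix cross-country differences in levels with within-country changes over time. The two can differ in size and even in sign, which is one reason the regressions later control for country effects. A minimal base-R sketch on deliberately constructed toy data (not the Eurostat panel) in which the pooled correlation is positive while the within-country correlation is exactly -1:

```r
# Constructed example: countries differ in level, and within each country
# y falls by one unit whenever x rises by one unit
toy <- data.frame(
  geo = rep(c("A", "B", "C"), each = 3),
  x   = c(-1, 0, 1,   9, 10, 11,  19, 20, 21),
  y   = c( 1, 0, -1, 11, 10,  9,  21, 20, 19)
)

# Pooled correlation: dominated by the differences in country levels
pooled_cor <- cor(toy$x, toy$y)

# Within correlation: computed after subtracting each country's mean
x_dm <- toy$x - ave(toy$x, toy$geo)
y_dm <- toy$y - ave(toy$y, toy$geo)
within_cor <- cor(x_dm, y_dm)

c(pooled = pooled_cor, within = within_cor)
```

The same demeaning could be applied to cor_vars (with geo retained) to see how much of each pooled correlation survives within countries.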

8 Data Visualization

This section provides graphical exploration of the data, including time series plots, cross-country comparisons, and scatter plots of bivariate relationships.

8.1 GDP per Capita Growth Over Time

ggplot(panel_data, aes(x = time, y = gdp_pc_growth, color = country_name)) +
  geom_line(alpha = 0.7, linewidth = 0.8) +
  geom_point(size = 1.5, alpha = 0.6) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray40", linewidth = 0.8) +
  labs(
    title = "GDP per Capita Growth in EU-27 Countries (2014-2023)",
    subtitle = "Annual percentage change (log difference × 100)",
    x = "Year",
    y = "GDP per Capita Growth (%)",
    color = "Country"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    legend.position = "right",
    legend.text = element_text(size = 8),
    plot.title = element_text(face = "bold")
  ) +
  scale_x_continuous(breaks = seq(2014, 2023, 1)) +
  scale_color_viridis_d(option = "turbo")

8.3 Average GDP Growth by Country

avg_growth <- panel_data %>%
  group_by(geo, country_name) %>%
  summarise(
    avg_growth = mean(gdp_pc_growth, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  arrange(desc(avg_growth))

ggplot(avg_growth, aes(x = reorder(country_name, avg_growth), y = avg_growth, fill = avg_growth)) +
  geom_col() +
  coord_flip() +
  geom_hline(yintercept = mean(avg_growth$avg_growth), linetype = "dashed", color = "red", linewidth = 0.8) +
  scale_fill_gradient2(low = "#D73027", mid = "#FFFFBF", high = "#1A9850", midpoint = 0) +
  labs(
    title = "Average GDP per Capita Growth by Country (2014-2023)",
    subtitle = "Red dashed line = EU-27 average",
    x = "",
    y = "Average Annual GDP Growth (%)",
    fill = "Growth"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    legend.position = "none",
    plot.title = element_text(face = "bold")
  )

8.4 Scatter Plots: Bivariate Relationships

8.4.1 Migration Rate vs. GDP Growth

ggplot(panel_data, aes(x = migration_rate, y = gdp_pc_growth)) +
  geom_point(aes(color = country_name), alpha = 0.6, size = 2) +
  geom_smooth(method = "lm", se = TRUE, color = "black", linetype = "dashed", linewidth = 1) +
  geom_hline(yintercept = 0, linetype = "dotted", color = "gray50") +
  geom_vline(xintercept = 0, linetype = "dotted", color = "gray50") +
  labs(
    title = "Migration Rate vs. GDP per Capita Growth",
    subtitle = "Each point = one country-year observation; dashed line = OLS fit",
    x = "Migration Rate (per 1,000 inhabitants)",
    y = "GDP per Capita Growth (%)",
    color = "Country"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    legend.position = "right",
    legend.text = element_text(size = 7),
    plot.title = element_text(face = "bold")
  ) +
  scale_color_viridis_d(option = "turbo")

8.4.2 Unemployment Rate vs. GDP Growth

ggplot(panel_data, aes(x = unemployment_rate, y = gdp_pc_growth)) +
  geom_point(aes(color = country_name), alpha = 0.6, size = 2) +
  geom_smooth(method = "lm", se = TRUE, color = "black", linetype = "dashed", linewidth = 1) +
  geom_hline(yintercept = 0, linetype = "dotted", color = "gray50") +
  labs(
    title = "Unemployment Rate vs. GDP per Capita Growth",
    subtitle = "Each point = one country-year observation; dashed line = OLS fit",
    x = "Unemployment Rate (%)",
    y = "GDP per Capita Growth (%)",
    color = "Country"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    legend.position = "right",
    legend.text = element_text(size = 7),
    plot.title = element_text(face = "bold")
  ) +
  scale_color_viridis_d(option = "turbo")

8.4.3 Investment Rate vs. GDP Growth

ggplot(panel_data, aes(x = investment_rate, y = gdp_pc_growth)) +
  geom_point(aes(color = country_name), alpha = 0.6, size = 2) +
  geom_smooth(method = "lm", se = TRUE, color = "black", linetype = "dashed", linewidth = 1) +
  geom_hline(yintercept = 0, linetype = "dotted", color = "gray50") +
  labs(
    title = "Investment Rate vs. GDP per Capita Growth",
    subtitle = "Each point = one country-year observation; dashed line = OLS fit",
    x = "Investment Rate (% of GDP)",
    y = "GDP per Capita Growth (%)",
    color = "Country"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    legend.position = "right",
    legend.text = element_text(size = 7),
    plot.title = element_text(face = "bold")
  ) +
  scale_color_viridis_d(option = "turbo")

8.5 Evolution of Key Variables: Selected Countries

To provide a more granular view of the data underlying the regression analysis, the figure below traces the temporal evolution of the three main explanatory variables—migration rate, unemployment rate, and investment rate—for ten representative EU economies. The selection spans core Western European economies (Germany, France, Netherlands), Southern European states (Spain, Italy, Portugal, Greece), Northern Europe (Sweden), Central-Eastern Europe (Poland), and a high-growth outlier (Ireland). This diversity captures the heterogeneity in economic structures and trajectories that motivates the panel data approach.

# Select representative countries across EU regions
selected_countries <- c("DE", "FR", "ES", "IT", "PL", "NL", "PT", "SE", "EL", "IE")
selected_names <- c(
  "DE" = "Germany", "FR" = "France", "ES" = "Spain", "IT" = "Italy",
  "PL" = "Poland", "NL" = "Netherlands", "PT" = "Portugal", "SE" = "Sweden",
  "EL" = "Greece", "IE" = "Ireland"
)

# Prepare data for plotting; GDP growth rate (%) is used for consistency with the plot in Section 8.1
evolution_data <- panel_data %>%
  filter(geo %in% selected_countries) %>%
  mutate(country_label = selected_names[geo]) %>%
  select(country_label, time, migration_rate, unemployment_rate, investment_rate, gdp_pc_growth)

# Separate data for explanatory variables
expl_vars_data <- evolution_data %>%
  select(country_label, time, migration_rate, unemployment_rate, investment_rate) %>%
  pivot_longer(
    cols = c(migration_rate, unemployment_rate, investment_rate),
    names_to = "Variable",
    values_to = "Value"
  ) %>%
  mutate(
    Variable = case_when(
      Variable == "migration_rate" ~ "Migration Rate (per 1,000)",
      Variable == "unemployment_rate" ~ "Unemployment Rate (%)",
      Variable == "investment_rate" ~ "Investment Rate (% of GDP)"
    ),
    Variable = factor(Variable, levels = c(
      "Migration Rate (per 1,000)",
      "Unemployment Rate (%)",
      "Investment Rate (% of GDP)"
    ))
  )

# GDP growth data for secondary axis
gdp_data <- evolution_data %>%
  select(country_label, time, gdp_pc_growth)

# Secondary axis: GDP growth is plotted on the same scale as the explanatory
# variables, so the right axis is an identity transform of the left one.
# The two constants below document that choice and are not used further.
gdp_scale_factor <- 1  # no rescaling
gdp_shift <- 0         # no shift

# Create faceted line plot with dual axis
ggplot() +
  # Explanatory variables (left axis)
  geom_line(data = expl_vars_data, 
            aes(x = time, y = Value, color = Variable, linetype = Variable),
            linewidth = 1) +
  geom_point(data = expl_vars_data, 
             aes(x = time, y = Value, color = Variable),
             size = 2, alpha = 0.8) +
  # GDP growth (right axis)
  geom_line(data = gdp_data,
            aes(x = time, y = gdp_pc_growth),
            color = "#7B3294", linewidth = 1.2, alpha = 0.8) +
  geom_point(data = gdp_data,
             aes(x = time, y = gdp_pc_growth),
             color = "#7B3294", size = 2.5, shape = 17, alpha = 0.8) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "gray40", linewidth = 0.5) +
  facet_wrap(~ country_label, ncol = 2, scales = "free_y") +
  scale_x_continuous(breaks = seq(2014, 2023, 2)) +
  scale_y_continuous(
    name = "Value (Migration, Unemployment, Investment)",
    sec.axis = sec_axis(
      ~ .,
      name = "GDP per Capita Growth (%)"
    )
  ) +
  scale_color_manual(
    values = c(
      "Migration Rate (per 1,000)" = "#2166AC",
      "Unemployment Rate (%)" = "#B2182B",
      "Investment Rate (% of GDP)" = "#1B7837"
    )
  ) +
  scale_linetype_manual(
    values = c(
      "Migration Rate (per 1,000)" = "solid",
      "Unemployment Rate (%)" = "dashed",
      "Investment Rate (% of GDP)" = "dotted"
    )
  ) +
  labs(
    title = "Evolution of Key Variables and GDP per Capita Growth (2014–2023)",
    subtitle = "Selected EU economies | Purple triangles: GDP per capita growth (%, right axis)",
    x = "Year",
    color = "Explanatory Variables",
    linetype = "Explanatory Variables"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    legend.position = "bottom",
    legend.title = element_text(face = "bold"),
    strip.text = element_text(face = "bold", size = 11),
    plot.title = element_text(face = "bold"),
    plot.subtitle = element_text(color = "gray40"),
    panel.grid.minor = element_blank(),
    axis.text.x = element_text(angle = 45, hjust = 1),
    axis.title.y.right = element_text(color = "#7B3294", face = "bold"),
    axis.text.y.right = element_text(color = "#7B3294")
  ) +
  guides(
    color = guide_legend(nrow = 1),
    linetype = guide_legend(nrow = 1)
  )
Evolution of key explanatory variables and GDP per capita growth (2014–2023) for ten representative EU economies. Each panel displays one country; lines represent the migration rate (per 1,000 inhabitants), unemployment rate (%), and investment rate (% of GDP) on the left axis, with GDP per capita growth (%) shown on the right axis. The plot illustrates cross-country heterogeneity in variable trajectories and highlights common shocks such as the COVID-19 pandemic (2020–2021).


The figure reveals substantial cross-country heterogeneity in variable levels and trajectories. The COVID-19 shock (2020) is visible as a sharp GDP contraction followed by a V-shaped recovery in the selected countries. These differences in levels and dynamics motivate the use of fixed effects to absorb time-invariant country characteristics.

9 Econometric Analysis

This section estimates the panel regression models (Pooled OLS, Fixed Effects, Random Effects) to test the hypotheses regarding the determinants of GDP growth.

9.1 Panel Data Setup

pdata <- pdata.frame(
  panel_data,
  index = c("geo", "time")
)
pdim(pdata)
## Unbalanced Panel: n = 27, T = 9-10, N = 269

9.2 Pooled OLS Model

pooled_ols <- plm(
  gdp_pc_growth ~ migration_rate + unemployment_rate + investment_rate,
  data = pdata,
  model = "pooling"
)

summary(pooled_ols)
## Pooling Model
## 
## Call:
## plm(formula = gdp_pc_growth ~ migration_rate + unemployment_rate + 
##     investment_rate, data = pdata, model = "pooling")
## 
## Unbalanced Panel: n = 27, T = 9-10, N = 269
## 
## Residuals:
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -14.172765  -1.482294   0.024949   1.923257  17.995276 
## 
## Coefficients:
##                    Estimate Std. Error t-value Pr(>|t|)  
## (Intercept)        1.581061   1.033860  1.5293  0.12739  
## migration_rate    -0.137823   0.068770 -2.0041  0.04608 *
## unemployment_rate -0.041585   0.060438 -0.6880  0.49202  
## investment_rate    0.099503   0.057474  1.7313  0.08457 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    3767.2
## Residual Sum of Squares: 3666.1
## R-Squared:      0.026842
## Adj. R-Squared: 0.015825
## F-statistic: 2.43644 on 3 and 265 DF, p-value: 0.065094

9.3 Fixed Effects Model

9.3.1 Why Two-Way Fixed Effects?

The sample period 2014–2023 includes several major global shocks that affected all EU countries simultaneously:

  • COVID-19 pandemic (2020): An unprecedented exogenous shock causing synchronized GDP contractions across the EU, followed by a strong rebound in 2021.
  • Energy crisis and inflation surge (2022): The war in Ukraine triggered energy price spikes and inflationary pressures affecting all member states.
  • European debt crisis aftermath (2014–2015): Lingering effects of the sovereign debt crisis, particularly in Southern Europe.

A one-way fixed effects model (controlling only for country heterogeneity) leaves these common time shocks in the error term. To the extent that the shocks are correlated with the explanatory variables, the coefficients absorb part of their effect, which is omitted variable bias. For instance, if unemployment rose across all countries in 2020 because of COVID-19 (a common shock) while growth collapsed, a one-way FE model could estimate a spuriously strong negative relationship between unemployment and growth.

Two-way fixed effects address this by including both:

  1. Country fixed effects (\(\alpha_i\)): Control for time-invariant country characteristics (institutions, geography, culture)
  2. Time fixed effects (\(\lambda_t\)): Control for year-specific shocks common to all countries (COVID-19, energy crisis, EU-wide policies)

The model specification becomes: \[\text{GDP Growth}_{it} = \beta_1 \text{Migration}_{it} + \beta_2 \text{Unemployment}_{it} + \beta_3 \text{Investment}_{it} + \alpha_i + \lambda_t + \varepsilon_{it}\]

This ensures that the coefficients \(\beta\) are identified only from the variation that remains after removing country means and year means, so they are purged of both time-invariant country characteristics and shocks common to all countries in a given year.
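As a minimal sketch of what the two-way specification does, the simulated example below adds a common year shock that is correlated with the regressor. The data are hypothetical (not the Eurostat panel), and the names `sim`, `shock`, and `true_beta` are illustrative assumptions. In a balanced panel, least squares with country and year dummies gives the same slope as the two-way within estimator, so `lm()` with `factor()` terms is used to keep the example in base R.

```r
set.seed(42)

n_country <- 27
n_year    <- 10
true_beta <- -0.3   # hypothetical true effect of x on y

sim <- expand.grid(country = seq_len(n_country), year = seq_len(n_year))

# Time-invariant country effects and one common shock per year
alpha <- rnorm(n_country, sd = 2)
shock <- rnorm(n_year, sd = 3)

# The regressor moves with the common shock, so ignoring year effects
# confounds x with the shock
sim$x <- 5 + alpha[sim$country] + 1.5 * shock[sim$year] + rnorm(nrow(sim))
sim$y <- true_beta * sim$x + alpha[sim$country] - 2 * shock[sim$year] +
  rnorm(nrow(sim))

# One-way FE: country dummies only
b_oneway <- coef(lm(y ~ x + factor(country), data = sim))["x"]

# Two-way FE: country and year dummies
b_twoway <- coef(lm(y ~ x + factor(country) + factor(year), data = sim))["x"]

round(c(true = true_beta,
        oneway = unname(b_oneway),
        twoway = unname(b_twoway)), 3)
```

In this simulation the one-way estimate absorbs part of the common shock, while the two-way estimate stays close to the assumed true value.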

9.3.2 One-Way Fixed Effects (Country Only)

For comparison, we first estimate the one-way FE model:

fe_model_oneway <- plm(
  gdp_pc_growth ~ migration_rate + unemployment_rate + investment_rate,
  data = pdata,
  model = "within",
  effect = "individual"
)

summary(fe_model_oneway)
## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = gdp_pc_growth ~ migration_rate + unemployment_rate + 
##     investment_rate, data = pdata, effect = "individual", model = "within")
## 
## Unbalanced Panel: n = 27, T = 9-10, N = 269
## 
## Residuals:
##      Min.   1st Qu.    Median   3rd Qu.      Max. 
## -13.83399  -0.68165   0.35786   1.33627  13.65233 
## 
## Coefficients:
##                    Estimate Std. Error t-value Pr(>|t|)
## migration_rate    -0.159450   0.098284 -1.6223   0.1060
## unemployment_rate -0.104525   0.112914 -0.9257   0.3555
## investment_rate   -0.107810   0.100876 -1.0687   0.2863
## 
## Total Sum of Squares:    3175.1
## Residual Sum of Squares: 3125
## R-Squared:      0.015784
## Adj. R-Squared: -0.10364
## F-statistic: 1.2776 on 3 and 239 DF, p-value: 0.28269

9.3.3 Two-Way Fixed Effects (Country + Time)

fe_model <- plm(
  gdp_pc_growth ~ migration_rate + unemployment_rate + investment_rate,
  data = pdata,
  model = "within",
  effect = "twoways"
)

summary(fe_model)
## Twoways effects Within Model
## 
## Call:
## plm(formula = gdp_pc_growth ~ migration_rate + unemployment_rate + 
##     investment_rate, data = pdata, effect = "twoways", model = "within")
## 
## Unbalanced Panel: n = 27, T = 9-10, N = 269
## 
## Residuals:
##      Min.   1st Qu.    Median   3rd Qu.      Max. 
## -9.396520 -0.841384 -0.048067  0.895848 12.921548 
## 
## Coefficients:
##                    Estimate Std. Error t-value Pr(>|t|)   
## migration_rate    -0.219847   0.068905 -3.1906 0.001618 **
## unemployment_rate -0.286854   0.100952 -2.8415 0.004894 **
## investment_rate   -0.029063   0.061346 -0.4737 0.636128   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    1150
## Residual Sum of Squares: 1074.1
## R-Squared:      0.066034
## Adj. R-Squared: -0.088273
## F-statistic: 5.42054 on 3 and 230 DF, p-value: 0.0012781

9.3.4 Test for Significance of Time Effects

To formally test whether time fixed effects are necessary, we compare the two-way FE model against the one-way FE model:

  • H0: Time effects are jointly zero (one-way FE is adequate)
  • H1: Time effects are significant (two-way FE is needed)
f_test_time <- pFtest(fe_model, fe_model_oneway)
f_test_time
## 
##  F test for twoways effects
## 
## data:  gdp_pc_growth ~ migration_rate + unemployment_rate + investment_rate
## F = 48.799, df1 = 9, df2 = 230, p-value < 2.2e-16
## alternative hypothesis: significant effects
## **Interpretation:** The F-test for time effects rejects H0 (p = 1.6e-48): time fixed effects are jointly significant.
##  The two-way FE model is preferred, confirming that common shocks (e.g., COVID-19) must be controlled for.
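The F statistic above can be reproduced by hand from the residual sums of squares of the two nested models. A minimal sketch, using the rounded values printed in the one-way and two-way summaries, so the result matches the reported statistic only up to rounding:

```r
# Residual sums of squares as printed in the model summaries (rounded)
rss_oneway <- 3125     # restricted model: country effects only
rss_twoway <- 1074.1   # unrestricted model: country + year effects

df1 <- 9     # restrictions tested: 10 years minus one
df2 <- 230   # residual degrees of freedom of the two-way model

f_stat  <- ((rss_oneway - rss_twoway) / df1) / (rss_twoway / df2)
p_value <- pf(f_stat, df1, df2, lower.tail = FALSE)

f_stat    # close to the reported 48.799
p_value
```

Adding the nine year effects removes roughly two thirds of the one-way residual sum of squares, which is why the statistic is so large.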

9.4 Random Effects Model

The Random Effects model assumes that country-specific effects are uncorrelated with the regressors and can be treated as random draws from a population distribution. This allows for more efficient estimation when the assumption holds.

Note on two-way Random Effects: While the Fixed Effects model above uses a two-way specification (country + time), the standard Random Effects estimator in plm defaults to one-way (individual) effects. Two-way RE models are less commonly implemented and require additional assumptions. For the Hausman test comparison, we use the one-way RE model, acknowledging that this is a limitation. If the Hausman test rejects RE in favor of FE, the two-way FE model remains the preferred specification.

re_model <- plm(
  gdp_pc_growth ~ migration_rate + unemployment_rate + investment_rate,
  data = pdata,
  model = "random"
)

summary(re_model)
## Oneway (individual) effect Random Effect Model 
##    (Swamy-Arora's transformation)
## 
## Call:
## plm(formula = gdp_pc_growth ~ migration_rate + unemployment_rate + 
##     investment_rate, data = pdata, model = "random")
## 
## Unbalanced Panel: n = 27, T = 9-10, N = 269
## 
## Effects:
##                   var std.dev share
## idiosyncratic 13.0753  3.6160  0.95
## individual     0.6927  0.8323  0.05
## theta:
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.1771  0.1915  0.1915  0.1910  0.1915  0.1915 
## 
## Residuals:
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## -1.40e+01 -1.26e+00  7.45e-02 -6.35e-04  1.77e+00  1.75e+01 
## 
## Coefficients:
##                    Estimate Std. Error z-value Pr(>|z|)  
## (Intercept)        2.054506   1.159880  1.7713  0.07651 .
## migration_rate    -0.142243   0.074254 -1.9156  0.05541 .
## unemployment_rate -0.052171   0.068218 -0.7648  0.44441  
## investment_rate    0.070375   0.064358  1.0935  0.27418  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    3563.4
## Residual Sum of Squares: 3495.8
## R-Squared:      0.018991
## Adj. R-Squared: 0.0078856
## Chisq: 5.07167 on 3 DF, p-value: 0.16662
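The `theta` values in the output are the quasi-demeaning weights of the Random Effects estimator: each variable is transformed as \(x_{it} - \theta_i \bar{x}_i\), with \(\theta_i = 1 - \sqrt{\sigma^2_{\varepsilon} / (\sigma^2_{\varepsilon} + T_i \sigma^2_{\alpha})}\). A weight of 0 corresponds to Pooled OLS and a weight of 1 to the within (FE) transformation. As a rough check, assuming this standard one-way formula, the reported range can be recovered from the printed variance components:

```r
# Variance components as printed in the "Effects" block above
sigma2_idios <- 13.0753   # idiosyncratic variance
sigma2_indiv <- 0.6927    # individual (country) variance

# Quasi-demeaning weight for a country observed in n_years periods
theta <- function(n_years) {
  1 - sqrt(sigma2_idios / (sigma2_idios + n_years * sigma2_indiv))
}

# Countries with T = 9 and T = 10 observations
round(theta(c(9, 10)), 4)
```

The results are close to the reported minimum (0.1771) and maximum (0.1915). Weights this small mean the RE estimator removes only about a fifth of each country mean, which fits with the RE coefficients lying near the Pooled OLS estimates.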

10 Diagnostic Tests

This section conducts the standard specification tests to guide model selection.

10.1 Breusch-Pagan LM Test: RE vs. Pooled OLS

The Breusch-Pagan Lagrange Multiplier test examines whether there are significant individual effects that justify using Random Effects over Pooled OLS.

  • H0: No individual effects (Pooled OLS is adequate)
  • H1: Significant individual effects (Random Effects is needed)
bp_lm_test <- plmtest(pooled_ols, type = "bp")
bp_lm_test
## 
##  Lagrange Multiplier Test - (Breusch-Pagan)
## 
## data:  gdp_pc_growth ~ migration_rate + unemployment_rate + investment_rate
## chisq = 1.5204, df = 1, p-value = 0.2176
## alternative hypothesis: significant effects
## **Interpretation:** The BP-LM test is not significant (p = 0.2176 ). We fail to reject H0.
##  Pooled OLS may be adequate, as no significant individual effects are detected.

10.2 Step 1: Test for Heteroskedasticity

Before conducting the Hausman test, we must first diagnose whether heteroskedasticity is present in the data. This is critical because the validity of the standard Hausman test depends on the assumption of homoskedastic errors.

  • H0: Homoskedasticity (constant variance of errors)
  • H1: Heteroskedasticity (non-constant variance of errors)
bp_hetero_test <- bptest(fe_model, data = pdata, studentize = TRUE)
bp_hetero_test
## 
##  studentized Breusch-Pagan test
## 
## data:  fe_model
## BP = 1.6987, df = 3, p-value = 0.6372
## **Interpretation:** The Breusch-Pagan test does not reject homoskedasticity (p = 0.6372 ).
##  Standard errors and the standard Hausman test are valid.

10.3 Step 2: Hausman Tests (Standard vs. Robust)

The Hausman test examines whether the country-specific effects are correlated with the regressors.

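  • H0: Country effects are uncorrelated with the regressors; the random effects model is consistent and efficient (use RE)
  • H1: Country effects are correlated with the regressors; only the fixed effects model is consistent (use FE)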

10.3.1 Standard Hausman Test

hausman_test <- phtest(fe_model, re_model)
hausman_test
## 
##  Hausman Test
## 
## data:  gdp_pc_growth ~ migration_rate + unemployment_rate + investment_rate
## chisq = 28.271, df = 3, p-value = 3.187e-06
## alternative hypothesis: one model is inconsistent

10.3.2 Robust Hausman Test (Heteroskedasticity-Consistent)

hausman_robust <- phtest(fe_model, re_model, vcov = vcovHC)
hausman_robust
## 
##  Hausman Test
## 
## data:  gdp_pc_growth ~ migration_rate + unemployment_rate + investment_rate
## chisq = 28.271, df = 3, p-value = 3.187e-06
## alternative hypothesis: one model is inconsistent
## ### Comparison of Hausman Test Results
## | Test Version | Chi-Squared | P-Value | Decision |
## |--------------|-------------|---------|----------|
## | Standard     | 28.271 | 3.187e-06 | Reject H0 (Use FE) |
## | Robust (vcovHC) | 28.271 | 3.187e-06 | Reject H0 (Use FE) |
## 
## **Interpretation:** Since homoskedasticity was not rejected in Step 1, the Standard Hausman test is valid. The test indicates that the **Fixed Effects model** is preferred (p = 3.187e-06 < 0.05).
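The standard and robust statistics above are identical to the last digit. That is what one would expect if `phtest()` does not use the `vcov` argument when it is given two fitted models. In plm, a robust covariance enters through the regression-based variant of the test, which is called via the formula interface with `method = "aux"`. The sketch below shows that call on the `Grunfeld` data bundled with plm. It is a stand-in so the example runs on its own and is not the analysis of this report. For the present data one would substitute the growth formula and `pdata`.

```r
library(plm)

# Stand-in data: the Grunfeld investment panel bundled with plm
data("Grunfeld", package = "plm")
form <- inv ~ value + capital

# Conventional Hausman test through the formula interface
h_standard <- phtest(form, data = Grunfeld)

# Regression-based (auxiliary) Hausman test with a
# heteroskedasticity-robust covariance matrix
h_robust <- phtest(form, data = Grunfeld, method = "aux", vcov = vcovHC)

c(standard = unname(h_standard$statistic),
  robust   = unname(h_robust$statistic))
```

If the two statistics differ, the robust covariance has actually been applied, which is a quick sanity check for this kind of comparison.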

10.4 Step 3: Conclusion on Standard Errors

## Since homoskedasticity was not rejected, standard errors are valid. However, as a conservative approach, we may still report cluster-robust standard errors for robustness.
## 
## The preferred model based on the Standard Hausman test is: **Fixed Effects**.
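A minimal sketch of how cluster-robust standard errors could be reported, using plm's bundled `Grunfeld` data as a stand-in rather than the Eurostat panel (the object names `fe_demo` and `robust_tab` are illustrative). `vcovHC()` with `cluster = "group"` clusters on the individual dimension of the panel, which would correspond to countries in `pdata`.

```r
library(plm)
library(lmtest)

data("Grunfeld", package = "plm")

# Two-way fixed effects model on the stand-in data
fe_demo <- plm(
  inv ~ value + capital,
  data = Grunfeld,
  index = c("firm", "year"),
  model = "within",
  effect = "twoways"
)

# Coefficient table with standard errors clustered by individual (firm):
# Arellano-type covariance with an HC1 small-sample correction
robust_tab <- coeftest(
  fe_demo,
  vcov. = vcovHC(fe_demo, method = "arellano", type = "HC1", cluster = "group")
)

robust_tab
```

Applied to this report, the same pattern would use `fe_model` in place of `fe_demo`. Point estimates are unchanged and only the standard errors, t-values, and p-values differ.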

10.5 F-Test for Fixed Effects

Testing whether the fixed effects are jointly significant compared to Pooled OLS. Because fe_model is the two-way specification, the test covers country and year effects together:

  • H0: All country and year effects are zero (Pooled OLS is adequate)
  • H1: Fixed effects are significant (FE is needed)
f_test <- pFtest(fe_model, pooled_ols)
f_test
## 
##  F test for twoways effects
## 
## data:  gdp_pc_growth ~ migration_rate + unemployment_rate + investment_rate
## F = 15.859, df1 = 35, df2 = 230, p-value < 2.2e-16
## alternative hypothesis: significant effects
## **Interpretation:** The F-test is significant (p = 6.54e-44). Country and year effects are jointly significant.
##  Pooled OLS is inappropriate; we need to account for unobserved country and time heterogeneity.
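Because `fe_model` is the two-way specification, the 35 numerator degrees of freedom combine 26 country effects and 9 year effects, so this test does not isolate the country effects. A rough sketch of the decomposition follows, using the rounded residual sums of squares and degrees of freedom printed in the model summaries of Section 9. The helper `f_from_rss` is a hypothetical name, and the results agree with plm only up to rounding.

```r
# F test for nested models computed from residual sums of squares
f_from_rss <- function(rss_restricted, rss_unrestricted, df1, df2) {
  f <- ((rss_restricted - rss_unrestricted) / df1) / (rss_unrestricted / df2)
  c(F = f, p = pf(f, df1, df2, lower.tail = FALSE))
}

rss_pooled <- 3666.1   # Pooled OLS
rss_oneway <- 3125     # country effects only
rss_twoway <- 1074.1   # country + year effects

# Country and year effects jointly (the comparison reported above)
joint <- f_from_rss(rss_pooled, rss_twoway, df1 = 35, df2 = 230)

# Country effects alone: one-way FE against Pooled OLS
country_only <- f_from_rss(rss_pooled, rss_oneway, df1 = 26, df2 = 239)

rbind(joint, country_only)
```

The joint statistic reproduces the reported F of about 15.86, while the statistic for country effects alone is far smaller (about 1.6). This suggests that the rejection is driven mainly by the year effects, in line with the non-significant Breusch-Pagan LM test in Section 10.1.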

10.6 Summary of Diagnostic Tests

# Determine preferred model based on heteroskedasticity-adjusted logic
if (bp_hetero_test$p.value < 0.05) {
  # Heteroskedasticity present: use robust Hausman
  preferred_model <- ifelse(hausman_robust$p.value < 0.05, "Fixed Effects", "Random Effects")
  hausman_used <- "Robust"
} else {
  # No heteroskedasticity: use standard Hausman
  preferred_model <- ifelse(hausman_test$p.value < 0.05, "Fixed Effects", "Random Effects")
  hausman_used <- "Standard"
}

diagnostic_results <- data.frame(
  Step = c("1", "2a", "2b", "3", "4"),
  Test = c("Breusch-Pagan (Heteroskedasticity)", 
           "Hausman - Standard (FE vs. RE)",
           "Hausman - Robust (FE vs. RE)",
           "Breusch-Pagan LM (RE vs. OLS)", 
           "F-Test (FE vs. OLS)"),
  Statistic = c(round(bp_hetero_test$statistic, 3),
                round(hausman_test$statistic, 3),
                round(hausman_robust$statistic, 3),
                round(bp_lm_test$statistic, 3),
                round(f_test$statistic, 3)),
  `P-Value` = vapply(
    c(bp_hetero_test$p.value, hausman_test$p.value, hausman_robust$p.value,
      bp_lm_test$p.value, f_test$p.value),
    # Show very small p-values in scientific notation instead of rounding them to 0
    function(p) if (p < 1e-4) format(p, scientific = TRUE, digits = 2) else sprintf("%.4f", p),
    character(1),
    USE.NAMES = FALSE
  ),
  Decision = c(ifelse(bp_hetero_test$p.value < 0.05, "Heteroskedasticity Present", "Homoskedastic"),
               ifelse(hausman_test$p.value < 0.05, "Use FE", "Use RE"),
               ifelse(hausman_robust$p.value < 0.05, "Use FE", "Use RE"),
               ifelse(bp_lm_test$p.value < 0.05, "Use Panel Model", "OLS OK"),
               ifelse(f_test$p.value < 0.05, "FE needed", "Pooled OK")),
  check.names = FALSE  # keep "P-Value" as the column name instead of "P.Value"
)

kable(diagnostic_results, caption = "Summary of Diagnostic Tests (Logical Sequence)") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
  column_spec(5, bold = TRUE) %>%
  row_spec(ifelse(bp_hetero_test$p.value < 0.05, 3, 2), background = "#e6f3ff")
Summary of Diagnostic Tests (Logical Sequence)
Step  Test                                 Statistic  P-Value  Decision
1     Breusch-Pagan (Heteroskedasticity)   1.699      0.6372   Homoskedastic
2a    Hausman - Standard (FE vs. RE)       28.271     3.2e-06  Use FE
2b    Hausman - Robust (FE vs. RE)         28.271     3.2e-06  Use FE
3     Breusch-Pagan LM (RE vs. OLS)        1.520      0.2176   OLS OK
4     F-Test (FE vs. OLS)                  15.859     6.5e-44  FE needed
## ### Final Model Selection
## **Diagnostic Logic:**
## 1. **Heteroskedasticity Test (Step 1):**  Failed to reject homoskedasticity → Standard errors are valid.
## 2. **Hausman Test (Step 2):** Based on Step 1, we rely on the **Standard Hausman test**.
## 3. **Conclusion:** The **Fixed Effects** model is the preferred specification.

11 Model Comparison

This section compares the three estimators (Pooled OLS, FE, RE) side-by-side.

stargazer(
  pooled_ols,
  fe_model,
  re_model,
  type = "html",
  title = "Panel Regression Results",
  dep.var.labels = "GDP per capita growth",
  column.labels = c("Pooled OLS", "Fixed Effects", "Random Effects")
)
Panel Regression Results

Dependent variable: GDP per capita growth

                    Pooled OLS             Fixed Effects            Random Effects
                    (1)                    (2)                      (3)
migration_rate      -0.138**               -0.220***                -0.142*
                    (0.069)                (0.069)                  (0.074)
unemployment_rate   -0.042                 -0.287***                -0.052
                    (0.060)                (0.101)                  (0.068)
investment_rate     0.100*                 -0.029                   0.070
                    (0.057)                (0.061)                  (0.064)
Constant            1.581                                           2.055*
                    (1.034)                                         (1.160)
Observations        269                    269                      269
R2                  0.027                  0.066                    0.019
Adjusted R2         0.016                  -0.088                   0.008
F Statistic         2.436* (df = 3; 265)   5.421*** (df = 3; 230)   5.072
Note: *p<0.1; **p<0.05; ***p<0.01

12 Results Interpretation

12.1 Coefficient Interpretation (Fixed Effects Model)

Migration Rate: β = -0.220 (p = 0.002, significant at the 1% level). A one-unit increase in the net migration rate is associated with a 0.22 percentage point decrease in GDP per capita growth, conditional on country and year effects.

Unemployment Rate: β = -0.287 (p = 0.005, significant at the 1% level). A 1 percentage point increase in the unemployment rate is associated with a 0.287 percentage point decrease in GDP per capita growth. The sign is consistent with Okun’s Law.

Investment Rate: β = -0.029 (p = 0.636, not significant). The estimate is small relative to its standard error, so this specification provides no evidence of an investment-growth relationship.
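To convey the precision of these estimates, approximate 95% confidence intervals can be computed from the coefficients and conventional standard errors printed in Section 9.3.3. This is a sketch that assumes t-distributed statistics with the model's 230 residual degrees of freedom. The data frame `est` is an illustrative name.

```r
# Two-way FE estimates and standard errors as printed in Section 9.3.3
est <- data.frame(
  term     = c("migration_rate", "unemployment_rate", "investment_rate"),
  estimate = c(-0.219847, -0.286854, -0.029063),
  se       = c(0.068905, 0.100952, 0.061346)
)

t_crit <- qt(0.975, df = 230)   # two-sided 95% critical value

est$lower <- est$estimate - t_crit * est$se
est$upper <- est$estimate + t_crit * est$se

# An interval that excludes zero corresponds to significance at the 5% level
est$excludes_zero <- est$lower > 0 | est$upper < 0

est
```

The intervals for migration and unemployment lie entirely below zero, whereas the interval for investment spans zero.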

13 Limitations

The results of this analysis should be interpreted with caution. Several limitations affect the validity of causal inference:

  1. Endogeneity and reverse causality: The relationships between GDP growth and the explanatory variables (migration, unemployment, investment) are likely bidirectional. For example, GDP growth may attract migrants and stimulate investment, while simultaneously being affected by these factors. Fixed effects control for time-invariant country characteristics but do not address contemporaneous simultaneity.

  2. Structural breaks: The sample period (2014–2023) includes the COVID-19 pandemic (2020–2021), which introduced extraordinary variation in GDP growth. While this variation aids statistical identification, the estimated relationships may be influenced by crisis-period dynamics.

  3. Omitted variables: Important growth determinants such as human capital, trade openness, and institutional quality are not included in the model, potentially biasing the estimates.

Advanced econometric methods (such as instrumental variables or GMM) could address some of these concerns but are beyond the scope of this introductory analysis. The findings should therefore be interpreted as statistical associations rather than causal effects.

14 Conclusion

14.1 Summary

This analysis examined the determinants of GDP per capita growth in the EU-27 using panel data methods. The standard diagnostic tests guided model selection:

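  • The Breusch-Pagan LM test did not reject the null hypothesis (p = 0.218), indicating no significant random country effects relative to pooled OLS.
  • The Hausman test rejected the null hypothesis (p < 0.001), indicating that the Fixed Effects model is preferred over Random Effects.
  • The F-tests confirmed the joint significance of country and year fixed effects; the year effects on their own are highly significant (F = 48.8).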

14.2 Main Findings

Based on the Fixed Effects model:

  • Unemployment has a significant negative relationship with GDP growth (β = -0.287), consistent with H1 and Okun’s Law.
  • Investment does not show a statistically significant relationship with GDP growth.
  • Migration shows a significant negative relationship with GDP growth (β = -0.22).

These results should be interpreted as correlations rather than causal effects due to the limitations discussed above.


15 Appendix

15.1 Data Sources

All data were retrieved from Eurostat; the dataset codes are listed in Section 2.

15.2 Export Data for Submission

save(panel_data, file = "eu_growth_panel.RData")

cat("Exported dataset: eu_growth_panel.RData\n")
## Exported dataset: eu_growth_panel.RData
cat("Observations:", nrow(panel_data), "\n")
## Observations: 269
cat("Countries:", n_distinct(panel_data$geo), "\n")
## Countries: 27
cat("Sample period: 2014-2023\n")
## Sample period: 2014-2023